Machine Learning and Pharmacogenomics at the Time of Precision Psychiatry

Traditional medicine and the biomedical sciences are reaching a turning point because of the constantly growing impact and volume of Big Data. Machine Learning (ML) techniques and related algorithms play a central role as diagnostic, prognostic, and decision-making tools in this field. Another promising area becoming part of everyday clinical practice is personalized therapy and pharmacogenomics. Applying ML to pharmacogenomics opens new frontiers for tailored therapeutic strategies that help clinicians choose drugs with the best response and fewest side effects by combining genetic information with the clinical profile. This systematic review aims to outline the state of the art of ML applied to pharmacogenomics in psychiatry. Our search yielded fourteen papers; most were published in the last three years. The sample comprises 9,180 patients diagnosed with mood disorders, psychoses, or autism spectrum disorders. Prediction of drug response and prediction of side effects are the most frequently considered domains, addressed with supervised ML techniques, which first require training and then testing. Random forest is the most used algorithm; it combines several decision trees, reduces overfitting to the training set, and makes precise predictions. ML proved effective and reliable, especially when genetic and biodemographic information were integrated into the algorithm. Even though ML and pharmacogenomics are not yet part of everyday clinical practice, they are expected to gain an important role in the near future in improving personalized treatments in psychiatry.


INTRODUCTION
The biomedical sciences have always depended heavily on data, even more so in recent decades with the birth and development of precision medicine, which created the need to collect a growing number of complex, multi-dimensional data. These data derive from both the microscopic and macroscopic worlds, which has led to the need to process a massive amount of information. Today we are witnessing the early stage of the Big Data era, since most of the technologies, practices, and analytical applications appeared around 2010 [4]. Big Data denotes a massive amount of digital data collected from any source that are raw, unstructured, and too heterogeneous to be analyzed using conventional statistical and relational techniques [5]. The features of Big Data can be summarized with "the three V's": volume, variety, and velocity. First, volume refers to the massive quantity of data; each organization generates terabytes or petabytes of information. Second, variety describes the different natures of the data themselves. Third, velocity refers to the very high frequency with which today's data are generated, gathered, and processed. These data lose their value without an effective system for managing, extracting, and analyzing them [5]. The systematic and comprehensive exploration of data is mainly carried out using Artificial Intelligence, which provides a mechanism for data-driven hypotheses, experimental planning, precision, and evidence-based medicine.

Precision Psychiatry
Psychiatry is dedicated to understanding mental diseases and assisting those affected in leading gratifying lives. Although current treatment strategies for many mental disorders can be remarkably effective at improving patients' quality of life and mitigating the burden of symptoms, finding the proper treatment for an individual can be a long and arduous process, during which symptoms can worsen and the clinical risk related to other health conditions can increase.
Precision psychiatry is a promising new direction to overcome those limitations [6]. It consists of translating precision medicine methods into clinical psychiatry, drawing on the latest biomarker-based research approaches to accurately assess an individual's risk of developing mental illness for preventive purposes. Predictive psychiatry aims to construct clinical and molecular models to better predict individual and varied therapy responses and to improve the early detection of mental diseases. Pattern recognition could extract signatures from clinical, cognitive, imaging-based, and, where applicable, genetic data that can be applied quantitatively to individual patients to anticipate desired and undesired pharmacological effects. Another main objective of precision psychiatry is to identify drug treatments with better efficacy and tolerability for a specific patient [7].
The most recent innovations come from the fields of pharmacogenomics and Artificial Intelligence, of which ML is currently among the most promising approaches.

Machine Learning
ML draws on mathematics, statistics, and computer science. It can be considered an engine, a kind of "intelligent" product able to make accurate and precise predictions based on data from several different sources [8]. ML techniques use algorithms that describe the relationships between variables. These algorithms lie on a continuum from those that are easy to decode and understand to those that are very difficult to decode, whose inner workings may be compared to a "black box" [9].
Conventional statistical techniques, such as linear and logistic regression, are typically used for inference, that is, to describe how variables are related to each other. ML's primary goal, by contrast, is prediction: the main purpose is to assess whether and to what extent some data might predict an event [10]. The learning process has a crucial role in achieving predictive capability and divides ML into two categories: supervised ML and unsupervised ML.
Supervised ML is a technique in which a model is trained on a range of features associated with a known outcome. These features might be patients' characteristics or history related to a specific outcome (e.g., weight and BMI in relation to the onset of diabetes within some years). Once an algorithm is trained, it can predict outcomes when applied to a new data set. Predictions can be discrete (e.g., healthy/unhealthy, malignant/benign) or continuous (e.g., a range of values) [11,12]. Both features and outcomes are organized in a dataset to which an algorithm may be applied. The algorithm is then optimized during its development to reduce the risk of prediction errors.
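The train-then-predict workflow of supervised ML can be illustrated with a minimal sketch. The nearest-centroid classifier, the feature values, and the outcome labels below are illustrative assumptions chosen for brevity; they are not taken from any of the reviewed studies.

```python
from statistics import mean

def train_nearest_centroid(features, labels):
    """Learn one centroid (mean feature vector) per known outcome class."""
    centroids = {}
    for label in set(labels):
        rows = [x for x, y in zip(features, labels) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Hypothetical training set: (weight in kg, BMI) with a known discrete outcome.
train_X = [[95, 33.0], [102, 35.5], [88, 31.0], [62, 21.5], [70, 23.0], [58, 20.0]]
train_y = ["onset", "onset", "onset", "no_onset", "no_onset", "no_onset"]

model = train_nearest_centroid(train_X, train_y)

# Once trained, the model is applied to patients it has never seen.
new_patients = [[98, 34.0], [65, 22.0]]
predictions = [predict(model, x) for x in new_patients]
print(predictions)  # ['onset', 'no_onset']
```

In practice the features would be standardized first, since weight and BMI are on different scales; the sketch omits this step to keep the two phases (training and prediction) visible.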
Unsupervised ML has so far found few applications in medicine [13]. Its main difference from supervised learning is the absence of a predefined outcome. The algorithm has an exploratory purpose in this situation since the user does not include any output in the dataset. Hence, this kind of learning may yield significant insights into the most complex pathophysiologic mechanisms and possible new therapeutic paths; on the other hand, the learning process is more difficult to understand and apply in clinical practice. Therefore, due to the inherent unpredictability of its results, the application of unsupervised ML in clinical practice still faces several issues.

Machine Learning in Pharmacogenomics
Pharmacogenomics is widely considered one of the most promising fields of clinical medicine [14]; it focuses on identifying genomic aspects that could be correlated with drug effects and metabolism. It analyzes how the genetic makeup of an individual can affect the response to drugs, with a potentially positive impact on clinical practice, primarily in treatment-resistant mental disorders [15]. The most prescribed psychiatric drug in 2015 was sertraline, a member of the selective serotonin reuptake inhibitor (SSRI) class used for depression, obsessive-compulsive disorder, panic disorder, post-traumatic stress disorder, and anxiety disorders. This drug class includes many other molecules, such as fluvoxamine, citalopram, escitalopram, fluoxetine, and paroxetine, most of which demonstrated efficacy in 65% of treated patients or less [16,17]. A similar issue exists in treatment-resistant schizophrenia spectrum disorders, for which atypical antipsychotic drugs may show low or insufficient efficacy and response rates [18,19]. These elements underline the need for a new paradigm in treating psychiatric disorders, switching from the traditional evidence-based approach (based on data gathered in large populations of patients) to an individual-based and data-driven knowledge of clinical and biological data (phenotypical, genotypical, and molecular). Thus, the foundation of precision medicine consists of tailoring care to the unique characteristics of each patient [20].
Artificial intelligence and ML aim to provide data-driven algorithms that learn from past and present data to predict outcomes for unknown data or future events [21,22].
Thanks to the recent advances in multi-omics, precision psychiatry is acquiring high growth potential to meet the demand for new drugs and therapeutic interventions [23]. Multi-omics currently promises to improve knowledge of human health and disease, and many researchers are working on methods to generate and analyze disease-related data. Multi-omics applications have improved the understanding of host-pathogen interactions, infectious diseases, chronic and complex non-communicable diseases, and personalized medicine. However, the challenges we face today in precision psychiatry are still mostly unmet, considering the multifactorial aetiology of mental disorders. The development of software tools based on artificial intelligence and ML frameworks could help to predict specific quantitative and categorical phenotypes in clinical settings by utilizing next-generation multi-omics and neuroimaging datasets [24]. This study aims to systematically review the current evidence on the use of ML and artificial intelligence in precision psychiatry, underlining the current possibilities and promises of this approach for patients with mental disorders.

METHODS
A systematic review of the literature was conducted to investigate the application of ML technology to the study of pharmacogenomics in psychiatry, evaluating types and prospects of application. The examined studies were identified by searching online databases (PubMed, Scopus, Web of Science, CINAHL, and PsycINFO) using the following string: ((machine learning) OR (deep learning) OR (algorithm)) AND pharmacogen* AND (psychiatr* OR mental).
We included articles describing studies focused on the use of ML and artificial intelligence in the field of pharmacogenomics in psychiatry. We excluded articles unrelated to the central issue of this review.
The initial search was completed on November 28, 2021, producing 113 results on PubMed, 27 on Scopus, 29 on Web of Science, 10 on CINAHL, and 25 on PsycINFO. Of these, 26 articles from Scopus were removed because they duplicated results obtained from PubMed; 28 from Web of Science were removed because they duplicated articles in PubMed and Scopus; finally, all 10 articles from CINAHL and all 25 from PsycINFO were removed because they had already been identified through the other databases. Therefore, the preliminary screening was conducted on 115 articles (113 from PubMed, 1 from Scopus, and 1 from Web of Science). Among these, we excluded 51 articles unrelated to the object of this study, two editorials, 23 reviews, 19 studies that did not apply an ML method, three related to other fields of medicine, one letter to the editor, one animal study, and one ongoing study. Therefore, the final set included 14 peer-reviewed scientific articles (Fig. 1).
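The selection flow reported above can be checked with simple arithmetic. The sketch below only restates the counts given in the text; the dictionary keys are shorthand labels introduced here for readability.

```python
# Records retrieved per database and duplicates removed, as reported in the text.
retrieved = {"PubMed": 113, "Scopus": 27, "Web of Science": 29, "CINAHL": 10, "PsycINFO": 25}
duplicates = {"PubMed": 0, "Scopus": 26, "Web of Science": 28, "CINAHL": 10, "PsycINFO": 25}

# Reasons for exclusion during screening, as reported in the text.
excluded = {
    "unrelated to the topic": 51,
    "editorials": 2,
    "reviews": 23,
    "no ML method applied": 19,
    "other fields of medicine": 3,
    "letter to the editor": 1,
    "animal study": 1,
    "ongoing study": 1,
}

screened = sum(retrieved.values()) - sum(duplicates.values())
included = screened - sum(excluded.values())

print(screened)  # 115
print(included)  # 14
```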

ML Application Domains in Pharmacogenomics
Two domains of pharmacogenomic ML in psychiatry were identified: (i) prediction of drug response (n=12) and (ii) prediction of side effects (n=2).

1) Prediction of drug response includes articles aiming to identify which genes and subpopulation characteristics are associated with a positive response to a specific drug class or molecule. According to recent studies, individual phenotypic differences may also emerge from epigenetic modifications such as histone acetylation or DNA methylation. In addition, noncoding RNA interactions have a role in protein expression and may alter drug effects. Several promising therapeutic techniques are now being developed, although the role of epigenetics in pharmacological treatment response needs further study [38]. The study of gene-gene interactions may better delineate individual pharmacokinetic and pharmacodynamic pathways [21].
2) Prediction of side effects includes studies focusing on which genetic variables might play a role in the onset of undesirable effects due to a specific drug. In both domains, the core question is how to identify, based on a genomic study, a suitable subpopulation for a specific pharmacological treatment, in order to establish a tailored therapy that, theoretically, has no side effects and the best odds of a response.

Prediction of Drug Response
Eugene and colleagues focused on lithium treatment in bipolar and schizoaffective disorder; more specifically, they aimed to identify the gender-specific transcriptional-level regulators of lithium treatment response [28]. They performed four Differential Gene Expression Analyses (DGEA). DGEA-1 compared male vs. female patients to obtain the gender-specific transcriptome; DGEA-2 compared male non-responders vs. male responders; DGEA-3 compared female non-responders vs. female responders; and DGEA-4 compared male responders vs. female responders. The top 250 genes from DGEA-1 to DGEA-4 were then overlaid to obtain gender-linked genes related to the response to lithium treatment. After identifying the statistically significant DNA microarray genes, two ML algorithms were used for classification: decision tree and random forest. The decision tree algorithm was used to classify male versus female samples; more specifically, Ribosomal protein S4, Y-linked 1 (RPS4Y1) gene expression was ≥ 9.643 in male patients and < 9.643 in female patients, with a probability of 100%. A random forest algorithm was adopted for classifying male responders and female responders. The RBPMS2 and LILRA5 genes were involved in the lithium response in males, with an area under the receiver operating characteristic curve (AUROC) of 0.92, and the ABRACL, FHL3, and NBPF14 genes were related to lithium response in females, with an AUROC of 1. RBPMS2 encodes an RNA Binding Protein with Multiple Splicing, while LILRA5 encodes Leukocyte Immunoglobulin Like Receptor A5. ABRACL encodes ABRA C-Terminal-like protein, FHL3 encodes Four and a Half LIM Domains 3, and NBPF14 encodes Neuroblastoma Breakpoint Family Member 14.
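As an illustration of the two quantities mentioned above, the following sketch shows a single-threshold decision rule and a rank-based AUROC computation. Only the 9.643 cutoff comes from the text; the expression values, scores, and labels are hypothetical, and the code is not the pipeline used in [28].

```python
RPS4Y1_CUTOFF = 9.643  # expression threshold reported in the text

def classify_sex(rps4y1_expression):
    """One-split decision rule on RPS4Y1 expression."""
    return "male" if rps4y1_expression >= RPS4Y1_CUTOFF else "female"

def auroc(scores, labels):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical expression values for six samples.
expression = [10.8, 11.2, 9.7, 6.1, 5.9, 7.3]
print([classify_sex(v) for v in expression])

# Hypothetical classifier scores (1 = responder, 0 = non-responder).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(round(auroc(scores, labels), 3))  # 0.889, i.e. 8 of 9 pairs ranked correctly
```

An AUROC of 1 means that every responder is scored above every non-responder; with small samples this can also reflect overfitting, so it is usually interpreted together with the validation design.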
ML-based algorithms analyzing functionally validated pharmacogenomic biomarkers together with clinical measures could predict the remission/response rate to selective serotonin reuptake inhibitors (SSRIs) in patients affected by MDD [30]. Athreya et al. [30] studied 1,030 MDD patients treated with citalopram/escitalopram from the Mayo Clinic Pharmacogenomics Research Network Antidepressant Medication Pharmacogenomic Study (PGRN-AMPS; n = 398), Sequenced Treatment Alternatives to Relieve Depression (STAR*D; n = 467), and International SSRI Pharmacogenomics Consortium (ISPC; n = 165) trials. As pharmacogenomic biomarkers, they included six SNPs in or close to the TSPAN5 (rs10516436), ERICH3 (rs696692), DEFB1 (rs5743467, rs2741130, and rs2702877), and AHR (rs17137566) genes. These SNPs had been identified through genome-wide association studies of PGRN-AMPS plasma metabolites associated with SSRI response (serotonin) and baseline MDD severity (kynurenine) [30,39,40]. Unsupervised learning was applied to identify clusters of patients (men and women separately) with similar symptom severity at baseline and after 4 and 8 weeks of treatment. An Expectation-Maximization (EM) algorithm was applied that initially assumed only one component in the mixture (a single bell-shaped distribution) and gradually increased the number of components (distributions with multiple peaks) until an adequate fit of the data was achieved. Then, a random forest model (random forest R library) was trained using PGRN-AMPS baseline depression severity and pharmacogenomic data to predict remission/response, and the trained model was externally validated using STAR*D and ISPC data. For both women and men, the top predictor of remission was baseline depression severity, followed by the DEFB1_2 (rs2741130) and DEFB1_1 (rs5743467) SNPs, biomarkers identified in the authors' GWAS for plasma kynurenine concentrations. The top predictor of response in men was the TSPAN5 SNP, related to serotonin concentration, followed by the DEFB1_1 and DEFB1_2 SNPs. For response in women, the top predictor was the DEFB1_1 SNP, followed by baseline depression severity and the DEFB1_2 SNP [30,39,40].
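The idea of fitting a mixture with EM while increasing the number of components can be sketched as follows. This is a minimal one-dimensional illustration and not the procedure of [30]: the severity scores are hypothetical, the variance floor `min_var` is an added safeguard, and the Bayesian information criterion (BIC) is used here as one possible definition of "adequate fit", which the text does not specify.

```python
import math

def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_mixture(data, k, n_iter=200, min_var=0.25):
    """EM for a 1-D Gaussian mixture; returns (weights, means, variances, log_lik)."""
    data = sorted(data)
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [data[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [max(grand_var, min_var)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)
    log_lik = sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in data
    )
    return weights, means, variances, log_lik

def bic(log_lik, k, n):
    """BIC; a k-component 1-D mixture has 3k - 1 free parameters. Lower is better."""
    return (3 * k - 1) * math.log(n) - 2 * log_lik

# Hypothetical depression-severity scores with two apparent groups.
scores = [6, 7, 8, 8, 9, 10, 18, 19, 20, 20, 21, 22]
fits = {k: fit_mixture(scores, k) for k in (1, 2, 3)}
for k, (_, means, _, log_lik) in fits.items():
    print(k, [round(m, 2) for m in means], round(bic(log_lik, k, len(scores)), 1))
```

The loop starts from a single bell-shaped component and adds components one at a time; the penalty term in the BIC discourages adding peaks that the data do not support.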
Another 6-week cohort study evaluated the therapeutic outcome of different antidepressants [32]. The C allele of the rs6354 polymorphism and the G allele of rs12150214 (SLC6A4) were associated with a poorer treatment response to fluoxetine. The SNPs rs929377-rs6191-rs32897 were also significantly associated with the treatment response to fluoxetine. In female MDD patients, the minor alleles of rs6323 and rs1137070 in the MAOA gene were related to a worse response to venlafaxine.
Lin and colleagues tested a wrapper-based feature selection algorithm integrated with a boosting ensemble predictive framework for building predictive models of antidepressant treatment response among 421 MDD patients. Their primary purpose was to compare the efficacy of different ML techniques. This study demonstrated that the ensemble ML framework might be a valuable technique for creating bioinformatics tools for discriminating non-responders from responders before treatment with SSRIs [24].
Another study evaluated five different ML approaches (neural networks, recursive partitioning, learning vector quantization, gradient boosting machine, and random forest) on three different samples, testing 44 SNPs of 8 candidate genes (CACNA1C, CACNB2, ANK3, GRM7, TCF4, ITIH3, SYNE1, FKBP5). FKBP5 polymorphisms seemed effective candidates for inclusion in antidepressant pharmacogenetic tests. Furthermore, pathways including CACNA1C, a calcium channel-related gene, could be involved in treatment-resistant depression and could be considered for developing multi-marker predictors [26].
Kautzky and colleagues focused on treatment-resistant depression [35]. They demonstrated that a random forest algorithm combining SNPs (12 SNPs in HTR2A, COMT, ST8SIA2, PPP3CC, and BDNF) and clinical variables can detect treatment-resistant patients. A combination of two ML models was tested by Maciukiewicz et al. [27], who applied classification-regression trees (CRT) and a linear support vector machine (SVM) to predict duloxetine response in MDD. Additionally, they used genome-wide logistic regression to identify potentially significant SNP variants related to duloxetine response/remission and extracted the most promising predictors using LASSO regression. For remission, CRT performed slightly worse (accuracy = 0.51, sensitivity = 0.51, specificity = 0.51) than SVM (accuracy = 0.52, sensitivity = 0.58, specificity = 0.46). For response, both algorithms performed poorly: CRT models achieved an accuracy of 0.57, a sensitivity of 0.75, and a specificity of 0.15, while SVM models achieved an accuracy of 0.64, a sensitivity of 0.87, and a specificity of 0.07. For remission, SVM models are also reported with an accuracy of 0.41, a specificity of 0.43, and a sensitivity of 0.41. In conclusion, SVM models based on predefined classes perform significantly better.
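Accuracy, sensitivity, and specificity are all derived from the same confusion matrix, which helps in reading figures such as a sensitivity of 0.87 paired with a specificity of 0.07. The sketch below uses hypothetical labels and predictions; it does not reproduce the data of [27].

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = responder)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # share of true responders detected
        "specificity": tn / (tn + fp),   # share of true non-responders detected
    }

# Hypothetical cohort: 8 responders followed by 4 non-responders.
y_true = [1] * 8 + [0] * 4
# A model that labels almost everyone as a responder.
y_pred = [1] * 7 + [0] + [1] * 3 + [0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this hypothetical case the model reaches a sensitivity of 0.875 but a specificity of only 0.25: a classifier that mostly predicts the majority class can look acceptable on accuracy while being nearly uninformative about non-responders.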
Joyce and colleagues explored the application of ML tools to combined pharmacological treatment in MDD [34]. In detail, they examined data from 264 MDD patients treated with citalopram or escitalopram from the Mayo Clinic PGRN-AMPS and 111 MDD patients receiving combined antidepressant therapy from the Combined Medication to Enhance Outcomes of Antidepressant Therapy (CO-MED) study. The central hypothesis of Joyce and colleagues is that enriching clinical measures with biological ones (such as metabolomics and genomics) might improve the predictability of response to combined antidepressant therapies. They applied a first model comprising clinical, sociodemographic, and metabolomic (plasma metabolites) features and a second model additionally considering six validated SNPs related to MDD pathophysiology and citalopram/escitalopram response. These SNP biomarkers are located near or in the TSPAN5, ERICH3, DEFB1, and AHR genes [39-41]. Both linear and non-linear algorithms were tested. A linear regression model was successful at predicting changes in symptom scores using clinical and metabolomic features; the authors then tested extreme gradient-boosted decision tree-based ensembles (XGBoost) as nonparametric models. The nonparametric models identified possible non-linear relationships among predictors while predicting treatment outcomes. Finally, a cross-trial replication was conducted, showing that integrating data on specific metabolites and SNPs achieves more accurate treatment response predictions across classes of antidepressants [34].
Taliaz and his group used STAR*D patients' data for algorithm assembly and evaluation; they randomly divided 530 patients into a validation group of 271 and a test group of 259 [33]. They then proceeded with external validation of their ML tool on PGRN-AMPS data from patients treated with citalopram. They used several ML algorithms: SVM with a linear kernel, XGBoost, random forest, and Adaptive Boosting (AdaBoost). In addition, 5- or 10-fold repeated cross-validations (CVs) were performed on the training datasets to find optimal parameters, which were then used to re-train the various models on the complete training datasets. The authors considered 8,210 SNPs, derived from STAR*D genetic data and the Genome Reference Consortium Human genome, and obtained highly similar results for the STAR*D and PGRN-AMPS test sets, with good accuracy. These findings support the feasibility of applying ML algorithms to large datasets with genetic, clinical, and demographic features to improve accuracy in antidepressant prescription [42].
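Repeated k-fold cross-validation, as mentioned above, can be sketched with index bookkeeping alone. The function names, the sample size of 20, and the use of the repeat number as the shuffling seed are assumptions made for this illustration; they do not describe the implementation used in [33].

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for one shuffled k-fold partition."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx

def repeated_k_fold(n_samples, k, n_repeats):
    """Repeat the k-fold partition with a different shuffle each time."""
    for repeat in range(n_repeats):
        yield from k_fold_splits(n_samples, k, seed=repeat)

splits = list(repeated_k_fold(n_samples=20, k=5, n_repeats=2))
print(len(splits))  # 10 train/test pairs: 5 folds x 2 repeats
```

For each candidate parameter setting, a model would be fitted on every training split and scored on the matching test split; the setting with the best average score is then used to re-train the model on the complete training dataset.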
The only study focused on depressive symptoms across two different diagnostic groups (MDD and BD) was conducted by Borro et al. [31]. All the recruited patients were pharmacoresistant, with at least three previous failed treatments. A new algorithm-based tool, Drug-PIN, was employed to re-evaluate and optimize therapies. Results from Drug-PIN were compared with those obtained by therapy counseling. The number of baseline poly-therapies classified as low-, moderate-, or high-risk did not differ significantly between the manual system and the Drug-PIN system. Like the counseling process, the Drug-PIN system showed a significant decrease in the predicted treatment-associated risk. In summary, this informatics tool seems to replicate traditional counseling, potentially reducing time and the risk of mistakes in everyday clinical practice.
Switching attention to antipsychotic drugs, Lee and colleagues developed a computational algorithm to personalize schizophrenia treatment [25]. This kind of algorithm was first adopted to identify which patients benefit most from the treatment arm in clinical trials [43], and it is based on a classical clustering algorithm called partitioning around medoids (PAM). It uses both clinical profiles and genetic information, with two sets of SNPs, to predict who will or will not benefit from a specific antipsychotic medication among patients with schizophrenia. The model provided a good prediction for ziprasidone using 13 SNPs and 53 baseline variables [44,45].
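The clustering step underlying this approach can be illustrated with a small partitioning-around-medoids sketch. It is a simplified variant and not the algorithm of [25]: the initial medoids are simply the first k points (classical PAM uses a greedy BUILD step), and the two-dimensional patient profiles are hypothetical.

```python
import math

def total_cost(points, medoids, dist):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)

def pam(points, k, dist):
    """Simplified PAM: swap medoids with non-medoids while the total cost decreases."""
    medoids = list(range(k))  # naive start (assumption made for brevity)
    best = total_cost(points, medoids, dist)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in range(len(points)):
                if candidate in medoids:
                    continue
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                cost = total_cost(points, trial, dist)
                if cost < best:  # accept only strict improvements
                    medoids, best, improved = trial, cost, True
    # Assign every point to the cluster of its nearest medoid.
    labels = [min(range(k), key=lambda j: dist(p, points[medoids[j]])) for p in points]
    return medoids, labels

# Hypothetical patient profiles described by two standardized baseline variables.
profiles = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
medoids, labels = pam(profiles, k=2, dist=math.dist)
print(medoids, labels)
```

Unlike k-means, each cluster centre is an actual patient (the medoid), which makes the resulting groups easier to inspect, and any dissimilarity measure can be plugged in, including ones that combine clinical and genetic variables.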
Only one study focused on ASD [37]. An SNP ranking algorithm based on linear support vector regression (SVR) was implemented in MATLAB 2014 with the LIBSVM package. The SNP information was binarized, and the data were split into training and test sets in a leave-one-out cross-validation. On the training data, the binary SNP information was fitted to the oxytocin efficacy with SVR, and the mean square error was calculated. This procedure was repeated for each SNP. The set of most informative SNPs, based on the top 10% of the ranking, was chosen. For SVR, there were 4 clusters, and the weight of every SNP was calculated for each cluster. The authors evaluated the relationship between 27 OXTR SNPs and six types of behavioral/neural response to oxytocin treatment in 38 ASD patients. Major alleles of several prominent OXTR SNPs, including rs53576 and rs2254298, were related to the oxytocin effect. We summarized the main results of ML studies in psychiatry in Tables 1 and 2.
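The ranking logic (leave-one-out prediction per SNP, mean square error, selection of the best-ranked SNPs) can be sketched as follows. To keep the example self-contained, the SVR of [37] is replaced by a much simpler predictor, the mean response of training subjects sharing the same binarized genotype; the SNP names and all values are hypothetical.

```python
from statistics import mean

def loo_mse(genotypes, responses):
    """Leave-one-out mean square error when the response is predicted
    from a single binarized SNP (group-mean predictor, not SVR)."""
    errors = []
    for i, (g_i, y_i) in enumerate(zip(genotypes, responses)):
        train = [(g, y) for j, (g, y) in enumerate(zip(genotypes, responses)) if j != i]
        same_genotype = [y for g, y in train if g == g_i]
        # Fall back to the overall training mean if the genotype is unseen.
        prediction = mean(same_genotype) if same_genotype else mean(y for _, y in train)
        errors.append((y_i - prediction) ** 2)
    return mean(errors)

def rank_snps(snp_matrix, responses):
    """Return SNP names ordered from lowest to highest leave-one-out MSE."""
    scores = {name: loo_mse(g, responses) for name, g in snp_matrix.items()}
    return sorted(scores, key=scores.get)

# Hypothetical treatment-response scores for six subjects.
responses = [2.0, 2.2, 1.9, 0.1, 0.2, 0.0]
# Hypothetical binarized SNPs (1 = carrier of the major allele).
snp_matrix = {
    "snp_A": [1, 1, 1, 0, 0, 0],  # tracks the response closely
    "snp_B": [1, 0, 1, 0, 1, 0],  # unrelated to the response
}

ranking = rank_snps(snp_matrix, responses)
top = ranking[: max(1, round(0.10 * len(ranking)))]  # top 10%, at least one SNP
print(ranking, top)
```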

Prediction of Side Effects
Boloc and colleagues used ML techniques to test whether they can predict the side effects of drug treatment, for example, the extrapyramidal symptoms (EPS) that may occur during antipsychotic treatment [29]. Supervised class-prediction methods based on ML were applied. The models were trained to identify case and control classification patterns using the Discovery Sample of the SNPassoc R package. Support vector machine, Naive Bayes, and random forest were adopted, showing improved EPS prediction. Naive Bayes achieved the best result. The same purpose was pursued by Son et al. [36]. They investigated the polymorphisms associated with tardive dyskinesia (TD) in patients treated with typical antipsychotics and predicted it using multifactor dimensionality reduction (MDR). MDR is nonparametric and model-free, and consists of two stages. Stage 1 involves choosing the best combination of factors, and stage 2 involves classifying the combinations of genotypes into high-risk and low-risk groups based on the ratio of cases to controls with that genotype [46]. MDR ultimately selects one genetic model, single- or multi-locus, which predicts the phenotype with good success. The model's predictive ability was assessed using 10-fold cross-validation. Statistical significance was determined empirically by permuting the case and control labels 1000 times. The distribution of SLC6A11 genotypes differed significantly between patients with and without TD, providing significant evidence for gene-gene interactions (SLC6A11, GABRG3, and GABRB2) in its development [36].
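Stage 2 of MDR and the permutation test can be illustrated with a short sketch. It rests on several assumptions: the genotype combinations and case labels are hypothetical, the high-risk threshold is set to the overall case:control ratio, and accuracy is computed on the same data rather than by 10-fold cross-validation, so it is not the procedure of [36].

```python
import random
from collections import Counter

def mdr_risk_cells(genotypes, is_case, threshold=None):
    """Label each multi-locus genotype 'high' or 'low' risk from its case:control ratio."""
    cases = Counter(g for g, c in zip(genotypes, is_case) if c)
    controls = Counter(g for g, c in zip(genotypes, is_case) if not c)
    if threshold is None:
        # Assumption: use the overall case:control ratio as the cut-off.
        threshold = sum(cases.values()) / sum(controls.values())
    risk = {}
    for g in set(genotypes):
        ratio = float("inf") if controls[g] == 0 else cases[g] / controls[g]
        risk[g] = "high" if ratio >= threshold else "low"
    return risk

def accuracy(genotypes, is_case, risk):
    """Share of subjects whose risk label matches their case/control status."""
    hits = sum((risk[g] == "high") == bool(c) for g, c in zip(genotypes, is_case))
    return hits / len(genotypes)

def permutation_p_value(genotypes, is_case, n_perm=1000, seed=0):
    """Empirical p-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    observed = accuracy(genotypes, is_case, mdr_risk_cells(genotypes, is_case))
    labels = list(is_case)
    at_least_as_good = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        permuted = accuracy(genotypes, labels, mdr_risk_cells(genotypes, labels))
        if permuted >= observed:
            at_least_as_good += 1
    return (at_least_as_good + 1) / (n_perm + 1)

# Hypothetical two-locus genotypes for 12 subjects (1 = case, 0 = control).
genotypes = [("AA", "GG")] * 4 + [("AG", "GG")] * 4 + [("GG", "GT")] * 4
is_case = [1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0]

risk = mdr_risk_cells(genotypes, is_case)
print(risk)
print(permutation_p_value(genotypes, is_case))
```

Collapsing all genotype combinations into a single high-risk/low-risk variable is what reduces the dimensionality; the permutation test then asks how often a comparable classification would arise if genotype and phenotype were unrelated.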

DISCUSSION
In almost all these papers, ML algorithms showed promising results when combined with either pharmacogenomic information alone or clinical features.
The random forest algorithm seems to be the most adopted technique [26, 28-30, 32, 35]. The random forest classifier is a data mining method reported to offer better classification performance than other innovative algorithms [47]. These properties have made random forests increasingly popular in the last few years, especially in psychiatry [48]. The expression "random forest" derives from its being made of many trees; more specifically, a random forest is a classifier consisting of a collection of tree-structured classifiers, each built from an independent, identically distributed random vector. Each tree casts a unit vote for the most popular class [49].
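The voting mechanism can be made concrete with a deliberately reduced sketch. Each "tree" here is a one-split decision stump fitted on a bootstrap sample and a randomly chosen feature, which is a simplification of the full trees used in a random forest; the data and class names are hypothetical.

```python
import random
from collections import Counter

def fit_stump(X, y, feature):
    """One-split tree on a single feature; each side predicts its majority class."""
    best = None
    for t in sorted({row[feature] for row in X}):
        left = [c for row, c in zip(X, y) if row[feature] < t]
        right = [c for row, c in zip(X, y) if row[feature] >= t]
        if not left or not right:
            continue
        left_cls = Counter(left).most_common(1)[0][0]
        right_cls = Counter(right).most_common(1)[0][0]
        errors = sum(c != left_cls for c in left) + sum(c != right_cls for c in right)
        if best is None or errors < best[0]:
            best = (errors, t, left_cls, right_cls)
    if best is None:  # no valid split in this bootstrap sample
        cls = Counter(y).most_common(1)[0][0]
        return (feature, float("-inf"), cls, cls)
    return (feature, best[1], best[2], best[3])

def fit_forest(X, y, n_trees=25, seed=0):
    """Each tree sees a bootstrap sample and one randomly chosen feature."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sampling with replacement
        feature = rng.randrange(n_features)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest

def predict(forest, x):
    """Every tree casts one vote; the most popular class wins."""
    votes = Counter(left if x[f] < t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]

# Hypothetical data: two features per patient and a known treatment outcome.
X = [[1, 10], [2, 11], [3, 12], [7, 20], [8, 21], [9, 22]]
y = ["non_responder"] * 3 + ["responder"] * 3

forest = fit_forest(X, y)
print(predict(forest, [0, 9]), predict(forest, [10, 23]))
```

Because each tree is trained on a different random sample, individual trees may err in different ways, and aggregating their votes is what limits overfitting to the training set.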

Clinical applications based on ML techniques that consider pharmacogenomic data are not yet ready for everyday clinical use. All the studies adopted supervised ML, where the outcome is already known and the model is trained to understand and predict such outcomes. Only Athreya and colleagues additionally used unsupervised ML [30], and only to test whether the distribution of symptom severity scores was as expected; in a second step, they applied the trained random forest. Thus, although ML currently has an investigative role, once it is trained on large and homogeneous populations, its diagnostic and predictive function will become more reliable and could be better used in clinical practice as a diagnostic tool [10-12].
In 2013, the FDA stated that pharmacogenomic testing should be used in early-phase clinical trials to identify suitable populations, cohorts, and individuals "that should receive lower or higher doses of a drug, or longer titration intervals, based on genetic effects on drug exposure, dose-response, early effectiveness and common adverse reactions" [50]. However, this innovative approach has not yet been widely adopted by pharmaceutical companies, because of both the risk of reducing a drug's potential market size and the lack of extensive genomic data resources, data heterogeneity, and the absence of universal benchmarks [51].
Genomics represents a critical but small part of the data needed for patient stratification, which involves heterogeneous biomedical, demographic, and sociometric data and effective predictive ML models. Despite not being designed for research, the substantial amounts of data within electronic health records (EHRs) have proven useful in several notable GWAS and phenome-wide association studies [52]. Furthermore, studies on EHR-linked DNA biorepositories have successfully shown that integrating such pharmacogenomic and sociometric data can be helpful in predictive modeling for optimizing dosage and reducing dosing errors [52]. By using clinically available information, such as age, gender, and education, healthcare providers and clinical researchers can identify better treatment options and patient responses to maximize efficacy and cost-effectiveness [52,53].
However, several challenges are associated with the effective integration of EHR data with pharmacogenomics applications. For example, because of the high dimensionality of the EHR data structure, background noise, heterogeneity, sparsity, incompleteness, random error, and systematic biases [54], the extraction of relevant clinical phenotypes may require advanced computational models. Ongoing research in this field and recent advances in deep learning suggest that deep learning can overcome these difficulties and learn patient data representations that are useful for predicting treatment response, adverse effects, and outcomes [51]. Recent applications include the extraction of general-purpose representations of patients from EHRs, often performed with generative models trained on either static or temporal data [51]. These models can uncover patterns in sparse, complex, heterogeneous datasets and produce surrogate patient phenotypes. They are either unsupervised, such as Deep Patient [54], or semi-supervised, such as Denoising Autoencoder for Phenotype Stratification [55]. These models rely on an autoencoder network structure to model EHR data and derive patient representations predictive of the final diagnosis and of different outcomes (for example, drug response, mortality, adverse events, and hospitalization risk). As generative deep model development progresses quickly, applications of novel architectures, such as Generative Adversarial Networks, to EHR data are starting to emerge, demonstrating improved performance for disease prediction [56] and risk prediction given treatment [57].
Even though ML seems promising, the application of deep learning algorithms in mental health is still in its early stages, with limited exploration. ML is still treated as a black box by researchers, making it hard to understand how and why these deep learning techniques work [58]. In this specific context, it is not yet possible to identify and differentiate which mechanisms are crucial in predicting therapy response and side effects depending on the diagnosis.
Moreover, most of the research on ML is still at the proof-of-concept stage, and more real-life testing is needed. Psychiatric diseases are complex phenomena, with different aspects and variables (e.g., biological, social, and psychological) concurring in causing such illnesses. Dang and colleagues showed that there is no standard way to gather high-quality data. Labels are difficult to obtain, which makes ML approaches uncertain and calls for agreed best practices in handling ML models [59]. Such difficulties might hinder the application of ML models in everyday clinical practice. To make this tool effective and powerful, collecting a significant volume of high-quality data is essential. However, collecting more detailed data in greater volume requires collaboration among institutions and considerable effort, and it is far from easy [60].
The picture provided by our review is comprehensive, yet it must be considered in light of its limitations. First, 8050 of 9180 enrolled subjects were affected by MDD, 684 had a psychotic disorder, 60 were diagnosed with BD, and only 38 showed autistic features. These results mean that the application of ML to pharmacogenomics has, so far, been tested mainly on a few mental disorders. Further studies must be conducted to explore the feasibility of this tool in tailoring pharmacological treatments for other diseases.

LIMITS IN THE CURRENT APPLICABILITY OF MACHINE LEARNING ALGORITHMS IN PSYCHIATRY
ML provides exciting potential for detecting, preventing, and treating psychiatric disorders. However, many factors limit its current use in research and practice [61][62][63]. Data is arguably the most apparent constraint in developing ML models for diagnosing and treating psychiatric disorders [61]. In psychiatry, we do not have the comfort of rich numerical datasets such as those available in intensive care units. Large datasets with diverse participants are needed to create accurate ML models. Best practice guidelines have been published for developing and reporting ML models in biomedical research [61].
Ethical concerns must be addressed before ML models are used in psychiatric disorders. Privacy and digital data security issues may affect a person's mental health. Furthermore, many concerns are related to bias in ML models that may disadvantage underrepresented groups [62]. Like humans, all ML models have some degree of error, and that error could be associated with significant clinical issues. Identifying mental health disorder risks based on digital data, such as social media data, and providing help-seeking information may also be distressing for those who were unaware of their vulnerability [61]. The models are reliant on the availability of the specific predictors used to create them. The more intricate, time-consuming, and costly the predictors are to collect and enter into the ML algorithms, the harder the models will be to use in practice [63]. ML models that rely on social media data may only be helpful for active users of those platforms [61]. Once the specific predictors have been collected, they need to be entered into the ML algorithms. Ideally, this process would be automated, but automation may be challenging if the ML models rely on predictors from different sources [64]. Simple interfaces should be created to allow humans to enter data without requiring extensive technical training [64].
Finally, probably the most challenging issue to address technically is clinicians' trust in the algorithm's capabilities [65]. Clinicians may not value the recommendations made by ML models, or they may rely solely on them at the expense of their own clinical judgment. An adequate balance between the algorithm's diagnostic power and the clinician's judgment is necessary in a field such as psychiatry. In mental health disorders, diagnostic and therapeutic outcomes cannot always be mathematically defined, and clinicians' human qualities are still often needed to reach a satisfactory clinical outcome [65].

CONCLUSION
Precision psychiatry could already be considered a valuable clinical instrument in treating drug-resistant forms of many psychiatric disorders, such as MDD, BD, and psychoses. Indeed, it provides personalized therapy with improved efficacy and reduced adverse drug reactions by correlating genotype with clinical phenotype. Pharmacokinetic pharmacogenetic tests that combine different genomic variants show the most clinical utility. These tools are meant to supplement rather than replace prescriber decisions, with clinical judgment remaining critical in decision-making. Pharmacogenomics could improve shared decision-making and risk-benefit analyses in medication selection. Moreover, the combination of pharmacogenomics and ML in psychiatry may foster the development of clinical applications aimed at improving the choice of a drug treatment with optimal outcome and tolerability. However, to reach the full potential of precision psychiatry, further research is needed to combine biodemographic data with multi-omics biomarkers and the whole spectrum of gene interactions using the latest AI computational strategies.

FUNDING
None.

CONFLICT OF INTEREST
In the last two years, A.D.C. has provided lectures, received advisory board honoraria, or engaged in clinical trial activities with Fidia, which did not influence the content of this article; R.P. is a member of the advisory board of Drug-PIN AG, and this role did not influence the content of this study; M.P. has provided lectures, received advisory board honoraria, or engaged in clinical trial activities with Angelini, Lundbeck, Janssen, Otsuka, and Allergan, which did not influence the content of this article. All other authors have no conflicts concerning the subject matter of this study.